Abstract


The deployment of improvised explosive devices (IEDs) along major roadways has been a
favored strategy of insurgents in recent war zones, both for the ability to cause damage to targets
along roadways at minimal cost and as a means of controlling the flow of traffic and causing
additional expense to opposing forces. Among other related approaches which we discuss, the
adversarial  problem  has  an  analogue  in  the  Canadian  Traveller  Problem,  wherein  a  stretch  of 
road  is  blocked  with  some  independent  probability,  and  the  state  of  the  road  is  only  discovered 
once  the  traveller  reaches  one  of  the  intersections  that  bound  this  stretch  of  road. 

We  discuss  the  implementation  of  ideas  from  social  network  analysis,  namely  the  notion  of 
“betweenness  centrality,”  and  how  this  can  be  adapted  to  the  notion  of  deployment  of  IEDs 
with  the  aid  of  Generalized  Linear  Models  (GLMs):  namely,  how  we  can  model  the  probability 
of  an  IED  deployment  in  terms  of  the  increased  effort  due  to  Canadian  betweenness,  how  we 
can  include  expert  judgement  on  the  probability  of  a  deployment,  and  how  we  can  extend  the 
approach  to  estimation  and  updating  over  several  time  steps. 

This  is  a  final  report  of  research  carried  out  under  a  DARPA  Contract,  and  integrates  prior 
progress  reports  by  the  authors  with  subsequent  project  work  available  as  Thomas  and  Fienberg 
[2010].  It  also  includes,  in  Appendix  B,  a  list  of  papers  and  technical  reports  on  related  network 
topics developed by us during the research project.
1 Introduction


Vehicles  traverse  a  network  of  roads  which  may  be  compromised  by  an  adversary  with  the  placement 
of  improvised  explosive  devices  (IEDs).  At  a  minimum,  compromised  roads  can  be  avoided  and 
replaced  by  alternate  routes,  though  more  typically,  a  convoy  will  spot  a  suspected  IED  while 
already  on  the  road,  which  will  require  time  and  effort  to  disarm.  When  considering  the  information 
available  to  the  drivers,  the  routers  and  the  adversary,  this  can  lead  to  a  complex  game-theoretic 
scenario.  When  modeling  the  placement  of  an  IED  as  a  stochastic  process,  so  that  the  adversary 
places  IEDs  without  regard  to  the  response  of  the  protagonists,  we  can  set  the  problem  in  a 
decision-theoretic  framework,  as  in  Singpurwalla  [2008b, a]. 

A  somewhat  more  systematic  and  technical  description  of  the  problems  at  hand  has  the  following 
components: 

•  We  have  a  pre-existing  road  system  with  known  travel  times  and  capacities,  or  these  can  at 
least  be  approximated.  This  road  system  represents  a  network  at  its  highest  capacity. 

•  An  adversary  can  interfere  with  this  road  system;  this  interference  effects  changes  in  the 
traversable  graph,  possibly  (likely)  in  the  middle  of  the  traversal  process. 

•  We  have  a  single  convoy  of  vehicles  (possibly  a  single  vehicle)  traveling  from  one  node  to 
another  on  the  graph.  This  convoy  can  be  divided  into  multiple  sub-units,  which  may  take 
separate  paths,  with  some  additional  cost  for  security  but  some  additional  benefit  for  increased 
probability  of  “success.” 

•  The  goal  is  to  minimize  some  objective  function  based  on  the  time  of  travel,  expenses  for 
protection  and  actual  transport  (fuel,  escort,  etc.)  and  the  cost  of  the  loss  of  human  life.1 

For  the  sake  of  a  clear  definition  of  the  problem,  we  make  some  additional  assumptions: 

•  Only  roads  can  be  the  targets  of  IEDs,  not  intersections  or  places  of  interest.  This  is  done 
merely  for  clarification  of  the  methods  at  hand,  and  is  in  no  way  a  restriction  of  the  capabilities 
of  the  modeling  approach. 

•  To  “block”  a  road  is  to  extend  the  time  it  would  take  to  traverse  it  (as  a  clearing  effort  can 
be  brought  in)  or  to  cause  some  level  of  damage  to  anyone  trying  to  traverse  it  (by  ignoring 
the  clearing  option);  this  investigation  assumes  that  a  road  will  not  be  traversed  unless  it  is 
“clear”.

We propose a general method for incorporating this problem into a family of methods based on
the Generalized Linear Model, both in its static and dynamic forms, and by modifying concepts
from the field of Complex Networks so that they can be incorporated into a GLM. This method
is extremely flexible and allows for a large number of predictors and concepts to be added with
minimal additional construction.

1 We propose this without judgement on how to choose this function.

We begin in Section 2 with a review of the literature from Operations Research that is relevant
to the problem (other connecting literature is reviewed in Appendix A). We then introduce concepts
in  complex  networks  in  Section  3  that  are  relevant  to  the  problem,  namely  centrality  measures  of 
nodes  and  edges,  before  introducing  our  synthesis  of  these  ideas,  Canadian  Betweenness  Centrality, 
in  Section  4.  We  show  how  to  integrate  these  ideas  with  GLMs  in  Section  5  for  the  static  case, 
before  discussing  dynamic  extensions  in  Section  6. 

We  conclude  with  a  discussion  on  the  integration  of  expert  knowledge  in  Section  7,  followed  by 
a  short  discussion  on  the  improvements  that  we  believe  would  enhance  this  approach  in  Section  8. 

2  Literature  Review 

2.1  Vehicle  Routing  in  Transportation  Research 

Bertsimas and Simchi-Levi [1996] provide an overview of vehicle routing methods from a deterministic
view, extending these ideas to a stochastic or dynamic framework. They quote a canonical
example:  a  utilities  network  is  prone  to  failures  that  impede  transmission;  said  failures  vary  in 
magnitude,  timing  and  location  according  to  some  process.  At  the  same  time,  maintenance  units 
must  make  use  of  a  transportation  network  to  make  repairs,  so  that  the  goal  is  to  minimize  the 
total  system  downtime  by  being  efficiently  deployed. 

Figure  1  shows  several  examples  of  this  class  of  problem.  In  the  simplest  case,  the  fleet  consists 
of  a  single  vehicle  anticipating  failure  at  one  of  two  locations.  As  the  number  of  locations  and 
vehicles  grows,  ideal  placement  of  the  vehicles  in  the  fleet  requires  the  incorporation  of  several 
different  elements: 

•  The  impact  of  several  failures  at  once  in  the  whole  system,  including  their  integrated  behavior. 
For  example,  in  a  communications  system,  two  adjacent  nodes  suffering  failure  may  be  no 
worse  than  the  failure  of  one  of  them,  while  the  failure  of  two  separate  “bridge”  nodes  between 
two  large  groups  may  cripple  all  such  contact. 

•  The  minimization  of  travel  times  of  vehicles  to  respond  to  multiple  simultaneous  failures. 

•  The  connected  impact  of  one  failure  on  the  likelihood  of  a  failure  in  an  adjacent  node. 

If the nodes are connected into a networked system, as in the case of a communications network,
then the solution of this system’s network properties (a stochastic problem) will be required to solve
the deterministic vehicle deployment issue; for example, that a maintenance unit should be placed
to respond to the nearest, most critical event that might occur. As a result, this does not necessarily
require a joint solution of the two problems together.

Figure 1: Three vehicle routing scenarios. In each scenario, each location is prone to failures
requiring the presence of a repair vehicle; these failures occur at stochastic intervals with associated
costs. The problem to solve is the optimal placement of repair vehicles to minimize the costs
incurred by failure. In the first, one vehicle is required to anticipate failures at two different
locations; these scenarios become more complicated as additional locations and vehicles are added.

Bertsimas  and  Simchi-Levi  [1996]  refer  to  the  broader  area  of  dynamic  transportation  research. 
They  include  several  subcategories  relevant  to  our  problem: 

•  Dynamic  fleet  management  (Powell  [1986]  among  others).  A  fleet  of  vehicles  is  distributed  on 
the  nodes  of  a  network,  ready  to  assist  at  other  nodes  as  needed.  This  requires  an  algorithm 
to  determine  which  vehicles  should  be  dispatched  at  any  given  time  to  handle  a  service  request 
given  the  likely  distribution  of  later  events.  This  application  diverges  from  ours  as  we  only 
consider  a  single  source-destination  pair  at  any  one  time. 

•  Dynamic  traffic  assignment  (Friesz  et  al.  [1989]),  in  which  individual  units  make  decisions  that 
optimize  their  own  arrangements  through  traffic,  mitigated  by  a  decision-making  process  at 
each  node  along  the  way,  such  as  minimizing  a  chosen  cost  function,  as  in  this  reference, 
according  to  a  general  Lagrangian  equation  for  the  system.  This  literature  mostly  examines 
the  overall  picture,  in  that  different  drivers  make  choices  with  implicit  respect  to  one  another. 
In this consideration, congestion is allowed to build on edges over time even if node traffic flow
is allowed to be constant, as this is accounted for in the cost function. While this literature
may have relevance to the general problem we tackle, most of it takes a steady-state control
perspective with a deterministic toolkit, so it is not likely to be directly adaptable.

•  Dynamic shortest path problems (Psaraftis and Tsitsiklis [1993]) make up a very general
class of algorithms. In the cited example, the “dynamic” aspect is that the nodes, rather
than the edges, have stochastic properties. While the choice is likely made for
the sake of keeping nodes as the independent unit, the properties of edges can be similarly
modeled in our approaches. This literature tends to focus on the computational complexity of
determining the shortest path, rather than an ensemble of short paths that may itself include
the shortest path, which is consistent with our problem at hand.

Figure 2: A traveller on the graph from A to B discovers that an edge is “untraversable” only
after arriving at one of its endpoints, and is forced to recalculate the optimal path based on this
information. This is the general premise behind stochastic shortest path problems with recourse.

Most  of  these  dynamic  problems  are  solved  through  linear  or  dynamic  programming,  allowing 
for  a  sizable  reduction  in  computing  effort.  It  is  not  immediately  clear  how  these  methods  can 
be  adapted  to  systems  where  the  variations  on  edge  properties  are  mutually  dependent,  though 
the  basic  framework  of  solving  simultaneously  for  all  possible  path  outcomes  is  one  that  remains 
consistent  in  this  framework. 

2.2  Stochastic  Shortest  Path  Problems  with  Recourse 

A  related  problem  to  that  under  our  investigation  is  the  Canadian  Traveller  Problem,  defined  in 
Andreatta  and  Romeo  [1988]  and  named  by  Papadimitriou  and  Yannakakis  [1991]  from  the  notion 
that roads may be closed due to stochastic intervention, namely a road closure due to sudden
snow  blockage,  where  the  realized  state  is  only  discovered  when  reaching  one  of  the  connecting 
nodes.  The  discovery  of  the  shortest  path  is  therefore  determined  dynamically  as  the  system  is 
traversed;  the  “recourse”  nature  of  the  problem  is  the  technical  term  for  dynamic  rerouting  during 
the  traversal.  This  is  the  nature  of  the  later  work  by  Polychronopoulos  and  Tsitsiklis  [1996]. 

This  problem,  as  demonstrated  in  Figure  2,  differs  slightly  from  scenarios  where  an  arc  traversal 
time  is  stochastic  but  finite,  as  it  may  necessitate  a  reversal  over  a  previously  travelled  arc  for  a 
finite  solution  to  exist. 
Among the investigations of this problem is Karger and Nikolova [2008], who address scenarios
in which exact computational solutions are known to exist, such as directed acyclic graph
structures; these cases do not necessarily translate to our current context.

Bnaya  et  al.  [2009]  add  the  possibility  of  information  gained  by  remote  sensing;  that  is,  the 
integration  of  non-local  information  gathering  on  the  state  of  the  system,  under  a  simple  information 
cost model. This approach may be integrated at some later stage with the aerial detection systems
proposed  by  Royset  and  Reber  [2008]  and  discussed  later  in  this  review. 

Croucher [1978] deals with a somewhat related problem: the case when an acyclic graph is known
but the path selection mechanism is stochastic. Namely, suppose that each arc $(i,j)$ emanating
from node $i$ has distance $D_{ij}$ and traversal probability $p_{ij}$ (in the standard problem, $p_{ij} = 1$ for all
traversable arcs), and that there are $n_i$ outbound arcs from node $i$. Then given that it is decided to
traverse arc $(i,j)$, the probability of traversing any of the other edges $(i,k)$ is $(1 - p_{ij})/(n_i - 1)$. This scenario is
considerably easier to solve through dynamic programming methods than others we have examined
so far, but its applicability is less direct to the problem at hand.

Additionally, Papadimitriou and Yannakakis [1991] examine when the graph is completely unknown
but embedded in a spatial manifold (such as a two-dimensional map) and the trajectory
and  distance  to  the  goal  is  known,  so  that  the  goal  is  to  produce  a  general  strategy  for  traversing 
the  graph  that  would  minimize  the  total  distance  and/or  cost  to  the  traveller. 

This  is  a  member  of  a  more  general  class  of  problems  in  reliability  theory.  Rather  than 
searching  for  single  optimal  solutions  to  shortest  path  problems,  the  purpose  of  a  reliability  study  is 
to  determine  how  a  system  responds  to  various  failures.  In  this  context,  we  seek  to  have  estimates 
of  the  robustness  of  a  system  when  certain  paths  become  unavailable,  or  at  a  minimum  more 
expensive. 

On a very broad scale, Banavar et al. [1999] cover scaling relationships between the size of a
networked system and the flow rates seen within. This notion of “allometric scaling” is a broad look
at estimating travel time along a network given the size. Other examples of large-scale resilience
estimation are listed in Dorogovtsev and Mendes [2003], though the focus is largely on grand-scale
asymptotic results for classic families of network generation, namely the Erdos-Renyi-Gilbert
random graph [Erdos and Renyi, 1959; Gilbert, 1959], the Watts-Strogatz “small world” framework
[Watts and Strogatz, 1998], and the Barabasi-Albert “scale-free” construction [Barabasi and Albert,
1999]. For a more complete review of the statistical network literature see Goldenberg et al. [2010].

Appearing earlier in the literature is Frank [1969], covering a directly related problem: algorithms
for calculating the distribution of the shortest path in a graph if the edge lengths are
stochastic  (but  finite)  in  nature.  While  only  a  rough  guide  to  the  process  of  solving  the  problem, 
this  was  clearly  ahead  of  its  time  in  thinking  about  this  sort  of  problem  in  detail. 

Finally, we note that the Canadian Traveller Problem was also explored under the name “bridge
problem” by Blei and Kaelbling [1999] due to the isomorphism between bridges-between-islands and
roads-between-intersections.

3  The  Perspective  of  Complex  Networks 

The  language  of  complex  networks  is  well-suited  to  problems  involving  valued  graphs  such  as  a 
road  system,  where  intersections  can  be  seen  as  nodes  and  the  roads  are  what  connect  them. 
Namely, a road network is considered to be a weighted graph $G(V, E)$, where a vertex $V_i$ represents
an intersection between roads, typically a point of interest; the weight of an edge $E_{ij}$, between
intersections $i$ and $j$, represents the distance between these points and the cost associated with
traversing  an  edge  in  the  problem  of  optimizing  the  total  expense  of  a  graph  traversal.  In  general, 
we  can  represent  a  point  of  interest  on  a  road  as  an  intersection  with  degree  2  -  that  is,  a  point 
along  the  road  of  interest  that  divides  the  road  in  two. 

If  there  is  a  non-zero  probability  that  this  road  is  blocked,  and  that  the  blockage  can  only  be 
discovered  once  we  reach  one  end  of  the  road,  then  we  have  the  essence  of  a  Canadian  Traveller 
Problem,  since  the  goal  is  to  calculate  the  optimal  route  plan  between  two  points,  including  all 
contingency  plans.  Before  we  introduce  the  CTP  into  the  system,  we  first  review  how  to  model  the 
graph  of  roads  with  IED  deployments,  as  a  subset  of  the  greater  road  network,  using  the  language 
of  Generalized  Linear  Models. 

The  key  to  the  approach  that  we  will  use  is  that  we  can  model  the  probability  that  any  particular 
stretch  of  road  will  have  an  IED  deployment  during  a  particular  time  interval,  and  that  past  events 
will  be  the  key  to  this  modeling.  In  particular,  we  assume  that  there  are  properties  of  the  roads 
that  lend  themselves  to  deployments,  both  “local”  in  the  sense  of  activities  along  the  road  itself, 
and  “global”  in  the  sense  that  the  road  holds  importance  as  a  connection  between  other  places  in 
the  network. 

3.1  Generalized  Linear  Model  Specifications  for  Deployment  Probability  Cal¬ 
culations 

Let $i, j \in \{1, \ldots, N\}$ index the nodes/intersections in the road system: $(i,j)$ then represents the
direct road from $i$ to $j$ if it exists. Given that a road exists between two intersections (let an
indicator $I_{ij}$ equal 1 if the road exists), the length of a road can be given as

Consider a family of models for estimating the probability that a road has ever had an
IED  deployment  using  a  Generalized  Linear  Model,  namely  a  probit  specification.  This  can  be 
extended  to  other  scenarios,  such  as  a  time-dependent  structure  in  Section  6,  and  incorporating 
the  tactical  position  of  the  adversary  as  in  Singpurwalla  [2008b];  for  this  section,  the  use  of  the 
“one-off”  specification  is  for  illumination.  (See  McCullagh  and  Nelder  [1989]  for  more  on  GLM 
specifications,  particularly  for  binary  data.) 
The general structure for the specification of a single road, given that the road exists, is
that a deployment was previously detected if $Y_{ij} = 1$. If the probability of a previous deployment
is $p_{ij}$, then $Y_{ij} \sim \mathrm{Be}(p_{ij})$ so that

\begin{align}
\Phi^{-1}(p_{ij}) &= \mu && \text{(baseline rate)} \tag{1} \\
&\quad + A_i && \text{(rate given intersection $i$)} \tag{2} \\
&\quad + A_j && \text{(rate given intersection $j$)} \tag{3} \\
&\quad + B_{ij} && \text{(properties of edge $(i,j)$).} \tag{4}
\end{align}

It  is  the  specification  of  each  of  these  terms,  including  covariates  on  intersections  and  edges, 
that  allows  us  to  gauge  their  historical  correlation  with  IED  deployment  likelihoods,  and  to  lead  to 
future  predictions  of  deployment  behavior. 

Intersection  and  Road  Factors 

There are two basic categories of inputs at intersections. Let $X_i$ be a vector of properties of the
intersection itself as a place of importance, such as proximity to a government building, school,
mosque or other landmark of interest. This allows us to distinguish these from the globally defined
properties of the intersection, which we label $Z_i$ and which derive from the nature of the
intersection in the traversal of the network itself.

There may also be unobserved reasons for the domination of the intersection itself, which would
suggest a node-specific intercept term, either unpooled (each $\tau_i$ is determined independently,
and all together sum to zero) or partially pooled ($\tau_i | \sigma \sim N(0, \sigma^2)$, and the best estimates for $\tau$ are
parametrically shrunk toward zero). Depending on the frequency of past deployments, it may not
be  wise  to  include  a  node-specific  intercept  term  in  this  equation,  particularly  if  there  are  several 
intersections  whose  roads  have  never  seen  a  deployment. 

All  together,  these  terms  can  be  collected  as 

\[ A_i = X_i \alpha + Z_i \beta. \]

Each of these characteristics of intersections can also be present on the roads that lead to them.
Let $X_{ij}$ identify a vector of local properties of the road itself (such as proximity to a local building
of interest), and let $Z_{ij}$ identify those properties of the road that are due to the global nature of
the road structure. Collecting these terms, we have

\[ B_{ij} = X_{ij} \gamma + Z_{ij} \delta. \]
We act as if the local properties of the road or intersection are known to the collector of the data,
as in Singpurwalla [2008b]; in the next section we describe a number of possible global properties
that measure the relative importance of nodes and edges in social network analysis and that serve as
the template for the importance of intersections and roads to transit.
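As a minimal sketch of how the pieces of this specification fit together, the following Python fragment assembles the intersection and road terms as linear combinations of covariates and passes their sum through the probit link. All function names, covariate values and coefficient values are hypothetical and chosen only for illustration; in practice the coefficients would be estimated from past deployment data rather than fixed by hand.

```python
import math


def dot(x, coef):
    """Inner product of a covariate vector and a coefficient vector."""
    return sum(xi * ci for xi, ci in zip(x, coef))


def std_normal_cdf(x):
    """Standard normal CDF (the inverse of the probit link), via math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def node_term(x_i, z_i, alpha, beta):
    """Intersection term: local covariates X_i plus global covariates Z_i."""
    return dot(x_i, alpha) + dot(z_i, beta)


def edge_term(x_ij, z_ij, gamma, delta):
    """Road term: local covariates X_ij plus global covariates Z_ij."""
    return dot(x_ij, gamma) + dot(z_ij, delta)


def deployment_probability(baseline, a_i, a_j, b_ij):
    """Probit model: p_ij = Phi(baseline + A_i + A_j + B_ij)."""
    return std_normal_cdf(baseline + a_i + a_j + b_ij)


# Hypothetical coefficients (assumed, not estimated from any data).
alpha, beta, gamma, delta = [0.4], [0.1], [0.3], [0.2]

a_i = node_term([1.0], [3.0], alpha, beta)    # e.g. near a landmark, degree 3
a_j = node_term([0.0], [2.0], alpha, beta)    # no landmark, degree 2
b_ij = edge_term([1.0], [1.0], gamma, delta)  # one local and one global road covariate
p_ij = deployment_probability(-1.9, a_i, a_j, b_ij)
print(round(p_ij, 4))
```

Because every predictor enters through a single linear sum, adding a new covariate only means extending one of the vectors and its coefficient vector.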

3.2  Including  Traditional  Centrality  Measures  As  Predictors 

Social  network  analysis  suggests  two  particular  approaches  to  considering  a  node’s  importance  to  a 
network:  closeness,  or  the  ability  to  reach  other  nodes  with  a  minimum  of  travel;  and  betweenness, 
or  the  importance  of  a  node  as  it  lies  between  the  transit  of  two  other  nodes.  While  the  latter  is 
clearly  the  definition  of  most  importance  to  the  process  on  our  network  of  interest  -  the  role  that 
roads  play  is  literally  that  of  betweenness  between  two  points  of  interest  -  it  is  worth  mentioning 
the  role  that  closeness  measures  can  play  as  an  input  for  the  nodal  term. 

Degree  and  Closeness 

The  most  basic  form  of  closeness  for  a  node  is  the  number  of  other  nodes  with  which  it  is  in  direct 
contact,  which  is  known  in  a  binary  network  as  the  “degree”  of  a  node.  In  this  case,  it  would 
represent  the  number  of  distinct  roads  that  lead  away  from  an  intersection  -  two  at  a  point  of 
interest  along  a  road,  three  at  a  “T”  junction,  four  where  two  roads  cross,  and  so  forth.  One  can 
include the degree of each intersection as a component of the $Z_i$ vector to check if a road attached
to  a  particular  intersection  is  more  likely  to  have  a  deployment. 

This  concept  is  likely  of  lesser  use  than  the  idea  of  closeness  centrality,  a  measure  of  the  average 
distance  between  a  node  and  all  other  nodes  in  the  network  system.  The  typical  measure  of  distance 
in  this  case  is  the  geodesic  distance,  or  the  shortest  path  from  node  i  to  node  j,  symbolized  as  d(i,  j); 
the  traditional  measure  of  closeness  centrality  is  then  the  inverse  average  distance  from  node  i  to 
all others, as shown in Freeman [1979],

\[ C_C(i) = \frac{N - 1}{\sum_{j \neq i} d(i,j)}. \]

By  taking  the  geodesic  distance,  we  can  of  course  assume  that  there  is  always  available  a  shortest 
path  of  this  length,  and  that  this  will  always  be  the  preferred  path  that  any  traveller  will  take. 
Modifications  of  the  distance  term  can  be  made  if  necessary,  so  long  as  the  distance  between  any 
two  points  remains  finite,  or  that  the  term  will  include  some  sort  of  penalty  for  waiting  for  a  repair 
of  the  road  so  that  the  journey  can  continue. 
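The two nodal measures in this subsection can be sketched in a few lines of Python. The example below is an illustration only: the three-intersection road system is hypothetical, and the all-pairs distances are computed with the Floyd-Warshall recursion purely for brevity (any shortest-path routine would serve). Closeness is taken as the inverse average geodesic distance from a node to all others.

```python
import math


def degrees(nodes, edges):
    """Number of distinct roads meeting at each intersection."""
    deg = {v: 0 for v in nodes}
    for (u, v) in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def all_pairs_shortest(nodes, edges):
    """Geodesic distances d(i, j) on an undirected weighted graph."""
    d = {u: {v: (0.0 if u == v else math.inf) for v in nodes} for u in nodes}
    for (u, v), w in edges.items():
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def closeness(nodes, edges):
    """Inverse average geodesic distance from each node to all others."""
    d = all_pairs_shortest(nodes, edges)
    n = len(nodes)
    return {i: (n - 1) / sum(d[i][j] for j in nodes if j != i) for i in nodes}


# Hypothetical road system: A - B - C along a single road with unit lengths.
nodes = ["A", "B", "C"]
edges = {("A", "B"): 1.0, ("B", "C"): 1.0}

deg = degrees(nodes, edges)
close = closeness(nodes, edges)
print(deg, close)
```

Either quantity could then be placed in the global covariate vector of an intersection; a node that cannot reach some other node has an infinite distance sum and therefore a closeness of zero under this definition.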


Betweenness 


The  notion  of  betweenness  stems  from  the  importance  of  a  node  (or  edge)  in  its  placement  for  the 
transit  between  other  nodes  in  the  system.  As  disrupting  this  transition  is  one  of  the  goals  in  IED 
deployment,  adapting  betweenness  measures  to  the  problem  at  hand  is  likely  the  most  direct  way 
of  assessing  the  likelihood  of  a  deployment. 

Given a pair of nodes that serve as the source and destination (labelled $i$ and $j$ respectively),
the standard geodesic betweenness of a node $k$ is measured in terms of all geodesic paths that
connect $i$ to $j$ that contain $k$. Define $A_{ij}$ as the set of all unique paths between $i$ and $j$ with the
geodesic traversal length $d(i,j)$; if $A_{ij}(k)$ is the set of all such paths that contain node $k$ as an
intermediate step, then the “betweenness” of node $k$ with respect to path $(i,j)$ is the fraction of
paths that contain it, $|A_{ij}(k)| / |A_{ij}|$. The overall betweenness measure of a node is then the average
of this measure with respect to all pairs of nodes.
This construction assumes that all pairs of nodes are equally important, and that a traveller will
pick uniformly at random from all shortest paths, omitting any other paths that may be marginally
longer. This does not mean that the construction cannot be adapted to allow for other eventualities.

There is an immediate adaptation to the importance of an edge, rather than a node, by replacing
the node $k$ with the edge $(k,l)$; the betweenness of an edge is then reflected in the share of paths
that traverse the edge $(k,l)$ when traveling from $i$ to $j$, with the set of such paths defined as
$A_{ij}(\{k,l\})$; the edge betweenness is then the average of $|A_{ij}(\{k,l\})| / |A_{ij}|$ over all pairs of nodes.
The removal of node $k$ or edge $\{k,l\}$ from the system is reflected in the betweenness statistic,
but it does not in itself reflect the situation in the system after removal has happened. For
the  problem  under  consideration,  the  additional  travel  distance/time  required  in  case  the  road  is 
discovered  to  be  blocked  is  a  more  apt  measure  of  the  road’s  importance,  which  we  detail  in  the 
next  section. 
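For reference, the node and edge betweenness averages described in this section can be written out as below. This is a hedged reconstruction from the verbal definitions: the symbols $C_B$ and $P$ are introduced here for illustration, and the normalization (a uniform average over a set $P$ of source-destination pairs) is an assumption, since the text says only that the measure is averaged over all pairs of nodes.

```latex
\[
C_B(k) = \frac{1}{|P|} \sum_{(i,j) \in P} \frac{|A_{ij}(k)|}{|A_{ij}|},
\qquad
C_B(\{k,l\}) = \frac{1}{|P|} \sum_{(i,j) \in P} \frac{|A_{ij}(\{k,l\})|}{|A_{ij}|}.
\]
```

Both quantities lie between 0 and 1, with a value of 1 meaning that every geodesic path between every pair in $P$ passes through the node or edge in question.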


4  Canadian  Traveller  Betweenness:  How  Much  A  Single  Road 
Blockage  Changes  Travel  Time 

We now introduce the essence of the Canadian Traveller Problem to this modeling approach. While
the standard definitions of closeness and betweenness are deterministic in nature, the notion of
Canadian Traveller Betweenness for a road/edge is an expected value for cases when an edge has a
particular stochastic property, that the time to traverse it will be dependent on an uncertain event
(the deployment of an IED) that can only be observed once one end of it has been reached.

Figure 3: When traveling from node A to node D, the traveller can choose the top route, with
distance/cost 2, or try the bottom, which has 2 equally probable outcomes: the path BD is open,
for a path ABD and cost 1.5, or the path is closed, leading to a path ABCD and cost 2.4.

First, we review how optimal paths are found between any two points of interest in a transportation
network when all road conditions are known using Dijkstra’s algorithm, then show how
this  extends  to  the  Canadian  Traveller  Problem  specification. 

After this review, we put forth the version of the problem that we find most compelling:
assessing the importance of a road in terms of travel from a source
to  a  destination.  By  considering  the  traveller  problem  focusing  on  one  road  at  a  time  -  solving 
the  problem  with  the  road  certain  to  be  blocked  (but  unknown  to  the  traveller),  and  with  the 
road  certain  to  be  unblocked  (again,  unknown  to  the  traveller)  -  we  assess  the  importance  of  the 
road  to  travel  in  the  system  itself,  similar  to  the  nature  of  betweenness  centrality  as  explored  in 
the  previous  section.  This  will  then  be  integrated  into  the  Generalized  Linear  Model  approach  in 
Section  5. 

4.1  A  Simple  Example  For  A  Single  Source-Destination  Pair 

To  illustrate  the  challenges  in  this  problem,  consider  Figure  3,  adapted  from  Singpurwalla  [2008b]. 
The  goal  in  this  case  is  to  travel  from  node  A  to  node  D;  all  edges  except  BD  are  known  to  exist, 
while  edge  BD  may  not  exist  with  probability  0.5.  Suppose  that  the  goal  is  to  find  the  travel  plan 
that yields the minimum average travel time (noting that many other standards are also acceptable)
and that the existence of BD is only known upon arrival at B or D.

A traveller taking the top path, A to C to D, makes a journey of distance 2. A traveller that
tries the lower path will find a short road with probability 0.5, and a total journey A-B-D with
distance 1.5, or no direct road to D, forcing a detour back to C and a journey A-B-C-D with
distance 2.4. Marginally, the expected length for the traveller choosing to try B is
0.5 × 1.5 + 0.5 × 2.4 = 1.95, a little less than the traveller trying the certain route through C.
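The comparison between the two travel plans can be checked with a short calculation. The sketch below uses the costs given in the Figure 3 caption (2 for the certain route, 1.5 and 2.4 for the two equally likely outcomes of trying B); the function name and the plan labels are hypothetical.

```python
def expected_cost(outcomes):
    """Expected cost of a travel plan given (probability, cost) outcomes."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * c for p, c in outcomes)


# Plan 1: take the certain route A-C-D.
via_c = expected_cost([(1.0, 2.0)])

# Plan 2: try B first; BD is open (A-B-D) or closed (A-B-C-D), each with probability 0.5.
via_b = expected_cost([(0.5, 1.5), (0.5, 2.4)])

# Choose the plan with the smaller expected cost.
best_plan = min([("via C", via_c), ("try B first", via_b)], key=lambda plan: plan[1])
print(via_c, via_b, best_plan[0])
```

Under the minimum-average-travel-time standard, trying B first is preferred by a small margin; a more risk-averse standard, such as minimizing the worst-case cost, would instead favor the certain route through C.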

This  construction  serves  to  demonstrate  the  stepwise  decision  process  that  must  be  made  by 
the  traveller:  a  move  toward  the  destination  commits  the  traveller  to  a  cost,  but  buys  information 
about  the  landscape  and  reduces  the  total  outcome  space.  Ahead  of  the  actual  transition  along 
the  graph,  the  user  must  assess  the  likelihood  of  each  path  being  free  or  blocked  before  making  a 
step  in  that  direction,  leading  to  a  trade-off  between  “discovery”  (the  benefits  of  learning  about 
the  system)  and  “progress”  (getting  closer  to  the  target  in  terms  of  the  unblocked  graph). 
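
The trade-off between discovery and progress can be made concrete with a small brute-force sketch. It assumes hypothetical edge lengths AB = 1, AC = 1, BC = 0.4, BD = 0.5 and CD = 1, chosen here because they reproduce the expected length of 1.95 quoted above; Figure 3 itself is not reproduced in this text. The function name `ctp_expected_cost` and its arguments are illustrative only, and the recursion is practical only for very small graphs.

```python
import heapq
from itertools import product


def ctp_expected_cost(edges, block_prob, source, target):
    """Expected length of the best travel plan on a small undirected graph.

    edges:      {(u, v): length}
    block_prob: {(u, v): probability the edge is blocked}; edges not listed
                are always open. An uncertain edge is revealed when the
                traveller stands at either of its endpoints.
    """
    length = {frozenset(e): w for e, w in edges.items()}
    uncertain = {frozenset(e): p for e, p in block_prob.items()}

    def hidden_at(node, known):
        # Uncertain edges touching `node` whose state is not yet known.
        return [e for e in uncertain if node in e and e not in known]

    def reachable(start, known):
        # Dijkstra over edges known to be open. Nodes that would reveal new
        # information are reached but not passed through, so each move ends
        # either at the target or at a node where something is learned.
        dist = {start: 0.0}
        heap = [(0.0, start)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u] or (u != start and hidden_at(u, known)):
                continue
            for e, w in length.items():
                if u in e and known.get(e, e not in uncertain):
                    (v,) = e - {u}
                    if d + w < dist.get(v, float("inf")):
                        dist[v] = d + w
                        heapq.heappush(heap, (d + w, v))
        return dist

    def value(node, known):
        # Arriving at `node` reveals its hidden edges; average over outcomes.
        hidden = hidden_at(node, known)
        total = 0.0
        for states in product([True, False], repeat=len(hidden)):  # True = open
            prob = 1.0
            for e, is_open in zip(hidden, states):
                prob *= (1 - uncertain[e]) if is_open else uncertain[e]
            if prob > 0:
                total += prob * best_move(node, {**known, **dict(zip(hidden, states))})
        return total

    def best_move(node, known):
        if node == target:
            return 0.0
        dist = reachable(node, known)
        options = [dist[target]] if target in dist else []
        for v, d in dist.items():
            if v not in (node, target) and hidden_at(v, known):
                options.append(d + value(v, known))  # pay d, then learn at v
        return min(options, default=float("inf"))

    return value(source, {})


# Hypothetical edge lengths (an assumption of this sketch, see the text above).
EDGES = {("A", "B"): 1.0, ("A", "C"): 1.0, ("B", "C"): 0.4,
         ("B", "D"): 0.5, ("C", "D"): 1.0}
expected = ctp_expected_cost(EDGES, {("B", "D"): 0.5}, "A", "D")
```

Under these assumed lengths the plan "try B first" costs 1 + 0.5 × 0.5 + 0.5 × 1.4 = 1.95, against 2 for the certain route through C, so the recursion selects the exploratory move.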

4.2  Finding Optimal Paths and Travel Plans

Dijkstra’s Algorithm for Shortest Paths

If  the  distance  between  all  connections  is  known,  the  algorithm  of  Dijkstra  [1959]  presents  the 
optimal  solution  for  the  shortest  path(s)  from  any  one  source  node  to  all  other  nodes.  The  essence 
of  the  algorithm  is  as  follows: 

•  Assign every node a tentative distance (an upper bound on its shortest distance from the
source): infinity for all nodes except the source, and zero for the source. Consider three
classes of nodes: “finished”, “active” and “untested”; label the source node as “active” and
the others as “untested”.

•  For an “active” node, note all direct connections to “untested” nodes. For each connection,
set the tentative distance of the untested node to the minimum of its current value and the
tentative distance of the active node plus the tie’s distance.

•  Set the active node to “finished”; set the untested node with the lowest tentative distance
to “active”, and repeat the procedure until all nodes are “finished”.

This procedure gives the minimum distance from the source node to every other node. To find
the shortest path to a given node, traverse the graph backwards from that node, selecting as the
next node a neighbour whose minimum distance equals that of the current node minus the length
of the connecting tie.
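
As an illustration, the procedure can be sketched in Python with a priority queue. The function names, the adjacency-dictionary format and the example edge lengths are assumptions of this sketch; it also records each node's predecessor during the search, which is a common variant of the backward traversal described above rather than that traversal itself.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from `source`.

    graph: {node: {neighbour: length}}; every node must appear as a key.
    Returns (dist, prev), where prev[v] is the node preceding v on a
    shortest path from the source.
    """
    dist = {node: float("inf") for node in graph}
    dist[source] = 0.0
    prev = {}
    finished = set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)  # the "active" node
        if u in finished:
            continue
        finished.add(u)
        for v, w in graph[u].items():
            if v not in finished and d + w < dist[v]:
                dist[v] = d + w     # lower the tentative distance
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    return dist, prev


def path_to(prev, source, node):
    """Rebuild the path source -> node from the predecessor map."""
    path = [node]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical edge lengths with every road open (an assumption of this sketch).
GRAPH = {
    "A": {"B": 1.0, "C": 1.0},
    "B": {"A": 1.0, "C": 0.4, "D": 0.5},
    "C": {"A": 1.0, "B": 0.4, "D": 1.0},
    "D": {"B": 0.5, "C": 1.0},
}
dist, prev = dijkstra(GRAPH, "A")
route = path_to(prev, "A", "D")
```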

The  Cost  of  Direct  Application 

One possible route to applying this methodology to the problem is through complete simulation. If
each of the r roads has only two possible states, then there are at most $2^r$ possible instantiations of
the graph to check, and the corresponding shortest paths can be solved for each instantiation. The challenge
of  the  traveller  -  and  the  essential  breakdown  of  the  problem  -  is  in  deciding  whether  to  take  a 
single  step  along  the  most  likely  shortest  path,  or  to  take  a  more  roundabout  route  in  order  to 
gather  more  information  about  the  road  system. 

Either  of  these  methods  induces  a  smaller  Canadian  Traveller  Problem,  where  the  destination 
remains  the  same,  the  source  changes,  and  the  uncertainty  is  lowered.  In  the  end,  this  still  requires 
the  generation  of  a  large  number  of  possible  realizations  of  the  problem. 
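
A minimal sketch of the complete-simulation route follows, assuming independent blockage probabilities and the same hypothetical edge lengths used for illustration elsewhere (AB = 1, AC = 1, BC = 0.4, BD = 0.5, CD = 1). The function names are illustrative. Note that averaging the shortest path over instantiations gives the expected length for a traveller who knows the road states in advance, which is only a lower bound on what a Canadian traveller can achieve.

```python
import heapq
from itertools import product


def shortest_length(edges, source, target):
    """Dijkstra on an undirected graph {(u, v): length}; inf if unreachable."""
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist[u]:
            continue
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return float("inf")


def enumerate_instantiations(edges, block_prob, source, target):
    """List (probability, blocked edges, shortest length) for all 2^r graphs.

    Keys of block_prob must match keys of edges exactly.
    """
    uncertain = list(block_prob)
    results = []
    for states in product([False, True], repeat=len(uncertain)):  # True = blocked
        prob = 1.0
        for e, is_blocked in zip(uncertain, states):
            prob *= block_prob[e] if is_blocked else 1 - block_prob[e]
        blocked = frozenset(e for e, b in zip(uncertain, states) if b)
        open_edges = {e: w for e, w in edges.items() if e not in blocked}
        results.append((prob, blocked, shortest_length(open_edges, source, target)))
    return results


# Hypothetical example: only BD is uncertain, blocked with probability 0.5.
EDGES = {("A", "B"): 1.0, ("A", "C"): 1.0, ("B", "C"): 0.4,
         ("B", "D"): 0.5, ("C", "D"): 1.0}
cases = enumerate_instantiations(EDGES, {("B", "D"): 0.5}, "A", "D")
full_information_mean = sum(p * length for p, _, length in cases)
```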

Barring  the  development  of  a  complete  solution  to  the  problem,  we  are  left  with  the  same  tools 
we  had  to  begin  with:  to  examine  the  shortest  paths  under  each  possibility,  and  to  aggregate  the 
probabilities  of  the  respective  scenarios  in  order  to  choose  an  optimal  route  plan,  with  a  contingency 
at  each  step  for  whether  the  desired  path  remains  optimal.  The  use  of  remote  sensing,  as  proposed 
by  Bnaya  et  al.  [2009],  suggests  another  factor  to  consider:  whether  a  secondary  mechanism  can 
be  used  to  check  for  the  presence  of  a  deployment  ahead  of,  or  parallel  to,  the  main  traveller,  but 
this  remains  a  hypothetical  option  with  the  same  eventual  constraints  applying:  multiple  instances 
will  have  to  be  considered  and  integrated  into  the  final  solution. 

4.3  Defining  Canadian  Betweenness  Centrality  for  an  Edge 

A  road’s  importance  to  travel  can  be  thought  of  in  the  sense  of  a  potential  trip  length:  how  long 
the  journey  would  be  if  the  road  were  available,  compared  to  the  case  when  it  is  not  available,  when 
the  discovery  of  availability  can  only  be  made  when  one  of  its  endpoints  is  reached. 

This differs from the case when the graph layout is known ahead of time. Suppose there are
multiple shortest paths between the source and destination, some but not all of which involve a
road (i, j). If the road state is known beforehand, the loss to the traveller is zero,
because the traveller can take one of the other paths and maintain the same travel distance. If
instead the traveller picks among all possible paths with some defined probability structure, those
cases where a path through (i, j) is chosen will result in an increased travel distance.

Thus, define the Canadian Betweenness Centrality of an edge/road as the proportional expected
increase in distance that a traveller would need to travel if the edge were removed, and that
removal were only detected by the traveller when arriving at one of its endpoints.

Consider the network in Section 4.1. We demonstrate how removing one of these five edges (and
only one) would affect the average travel distance from A to D, given that the average length of
the preferred pathway on the lower track is 1.95:

•  A  removal  of  AC  would  be  discovered  instantly,  rather  than  in  transition,  and  would  cause 
the  first  move  A  —  B.  If  the  criterion  is  lowest  average  path  length,  only  an  irrelevant  path 
would  be  removed;  the  increase  in  distance  would  be  zero. 

•  Removing AB (also discovered instantly) would force the first move to be A-C. The traveller
can then head straight to D with distance 1, or head to B on the 50% chance that the road BD is
open. If successful, the additional travel distance is 0.9; if not, the traveller must head back
to C before heading to D, for an additional distance of 1.8. The average path length in this
scenario is 1 + (0.9 + 1.8)/2 = 2.35, so that the traveller would be wise not to take it. The
increase in distance $\Delta c_{AD}(A, B)$ is 0.4.

•  Removing BC means that the failure of the bottom path would force the traveller to return
to A before trying the top path. With equal probability of lengths 0.9 and 4, and average
length 2.45, the increase in distance $\Delta c_{AD}(B, C)$ is 0.5.

•  Removing CD means that the only route to D would be via BD, if that road were open. Letting the
wait time for a road repair be $x_r$, there are two paths: the short path of length 0.9, and the
path A-B-C-B-(wait)-D of length $2.3 + x_r$, for an average increase of $\Delta c_{AD}(C, D) = x_r/2 - 0.35$.

•  If BD is intact, the traveling distance is 0.9; if not, the distance is 2.4, for a total increase in
distance of $\Delta c_{AD}(B, D) = 1.5$.

Noting that each of these terms was calculated with respect to the source-destination pair (A, D),
the  same  betweenness  measure  can  be  calculated  for  all  pairs  of  nodes,  and  an  average  betweenness 
measure  can  be  calculated  for  each  edge  in  the  system.  The  specification  of  the  weights  for  each 
source-destination  pair  in  the  system  can  be  chosen  to  be  equal,  or  proportional  to  the  number  of 
trips  taken  per  pair,  or  some  other  scheme  chosen  by  the  implementer.  For  equal  weight,  define  the 
Canadian  Betweenness  Centrality  as 

$$C_{CB}(i, j) = \frac{1}{(n-1)(n-2)} \sum_{k} \sum_{l<k} \Delta c_{kl}(i, j).$$

Given  that  the  relative  importance  of  each  road  is  now  calculated,  each  of  these  terms  can  be 
hypothetically  included  as  a  term  in  a  model.  This  depends  on  the  probability  of  a  deployment 
being  known  so  that  a  betweenness  measure  can  be  calculated  with  respect  to  all  other  ties  in  the 
system.  In  the  next  section  we  demonstrate  how  this  can  be  accomplished  through  an  iterative 
algorithm. 
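
The aggregation over source-destination pairs can be sketched as a weighted average. In this sketch the function name, the data layout and the numerical values are all hypothetical, and the weights are normalised by their total, which is an assumption made here for simplicity and may differ from the constant in the definition above by a fixed factor.

```python
def canadian_betweenness(increase, weights=None):
    """Average the per-pair distance increases for each edge.

    increase: {(source, dest): {edge: increase in expected distance}}
    weights:  {(source, dest): weight}; equal weights if omitted
              (for example, the number of trips taken per pair).
    Returns {edge: weighted average increase}.
    """
    if weights is None:
        weights = {pair: 1.0 for pair in increase}
    total_weight = sum(weights[pair] for pair in increase)
    result = {}
    for pair, per_edge in increase.items():
        for edge, delta in per_edge.items():
            result[edge] = result.get(edge, 0.0) + weights[pair] * delta / total_weight
    return result


# Hypothetical per-pair increases for two edges and two pairs.
INCREASE = {
    ("A", "D"): {("A", "B"): 0.4, ("B", "C"): 0.5},
    ("A", "C"): {("A", "B"): 0.0, ("B", "C"): 0.2},
}
equal = canadian_betweenness(INCREASE)
by_trips = canadian_betweenness(INCREASE, {("A", "D"): 3, ("A", "C"): 1})
```

Weighting by trip counts shifts the measure toward edges that matter for heavily travelled pairs, which is one of the schemes mentioned above.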

5  Integrating  CTP  Estimates  with  GLM  Constructions 

Now  that  we  have  a  mechanism  for  describing  the  importance  of  a  road  to  travel  in  a  road  network, 
we  can  include  these  terms  in  the  GLM  model  for  the  likelihood  of  a  deployment  along  each  road. 
The  only  trick  is  that  the  deployment  probabilities  appear  on  both  sides  of  the  equation,  as  both  the 
outcomes  on  the  left  and  as  components  in  the  betweenness  factor  on  the  right.  Here  we  describe 
an  iterative  method  for  solving  for  the  importance  of  traveller  betweenness  in  the  deployment  of 
an  IED. 

1.  For each road choose a starting value for the probability of a deployment. For computation’s
sake, $p_{ij} = 0$ is an appropriate (and fast) starting point. Hold this as $p_{ij}^{(0)}$.

2.  Calculate the Canadian betweenness centrality $B_{ij}$ for each road $(i, j)$ in the system using
the current deployment probability estimates $p_{ij}^{(t)}$.

3.  Solve the linear model system

$$Y_{ij} \sim \mathrm{Be}(p_{ij}); \quad \Phi^{-1}(p_{ij}) = \mu + (X_i + X_j)\alpha + (Z_i + Z_j)\beta + X_{ij}\gamma + Z_{ij}\delta$$

with one of the terms in the $Z_{ij}$ vector equal to $B_{ij}$.

4.  With the estimates for $(\mu, \alpha, \beta, \gamma, \delta)$, calculate the new deployment probability $p_{ij}^{(t+1)}$.

5.  Repeat steps 2-4 until the deployment probabilities have converged.
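
The loop structure of these steps can be sketched as a generic fixed-point iteration. The two callables, `betweenness` and `fit_and_predict`, are placeholders that the implementer would have to supply (the betweenness computation and the GLM fit respectively); their names, the convergence tolerance and the toy stand-ins below are assumptions of this sketch and not part of the method as stated.

```python
def iterate_deployment_probabilities(p_start, betweenness, fit_and_predict,
                                     tol=1e-8, max_iter=200):
    """Alternate betweenness and model fitting until probabilities settle.

    p_start:         {road: starting probability} (step 1)
    betweenness:     function {road: p} -> {road: B} (step 2)
    fit_and_predict: function {road: B} -> {road: new p} (steps 3 and 4)
    Returns (probabilities, number of iterations used).
    """
    p = dict(p_start)
    for iteration in range(1, max_iter + 1):
        b = betweenness(p)
        p_new = fit_and_predict(b)
        change = max(abs(p_new[road] - p[road]) for road in p)
        p = p_new
        if change < tol:          # step 5: stop once the estimates converge
            return p, iteration
    raise RuntimeError("deployment probabilities did not converge")


# Toy stand-ins, purely to exercise the loop; they model nothing real.
def toy_betweenness(p):
    return {road: 1.0 - value for road, value in p.items()}


def toy_fit_and_predict(b):
    return {road: 0.1 + 0.2 * value for road, value in b.items()}


probs, n_iter = iterate_deployment_probabilities(
    {("A", "B"): 0.0, ("B", "D"): 0.0}, toy_betweenness, toy_fit_and_predict)
```

Convergence is not guaranteed in general; the iteration cap in the sketch is there so that a non-convergent case fails loudly rather than looping forever.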

As proposed, this algorithm is by no means fast; the Canadian Traveller Problem is #P-hard
and lacks a quick solution, making this algorithm easiest to run on small networks. For larger
networks, rather than using the full space of possible graphs, one may instead sample from a subset
of the $2^r$ possible graphs. This would be similar to the pseudolikelihood methods used to simulate
from ERGMs [Crouch et al., 1998].

5.1  Other  Extensions  To  The  One  Time-Point  Case 

This  is  a  proposed  road  map  for  the  integration  of  Generalized  Linear  Model  methods  for  measuring 
and  pooling  information  on  IED  deployment  with  the  properties  of  the  road  system  itself.  There 
is  a  considerable  number  of  possible  improvements  and  developments  that  can  be  made  on  this 
framework. 

Likely  Travel  Paths 

The  relative  importances  of  these  roads  have  been  determined  on  the  assumption  that  the  shortest 
(average)  path  would  be  preferable  over  any  others.  Realistically,  there  is  also  no  guarantee  that  the 
traveller  will  take  the  shortest  path,  or  that  paths  of  slightly  longer  length  will  not  be  considered 
in  a  traveller’s  potential  plans.  Each  of  these  importance  measures  can  be  adjusted  to  consider  the 
utility  of  taking  longer  paths  by  chance,  and  estimating  the  additional  travel  caused  by  deployments 
on  that  basis. 

Adversarial  Deployment  Patterns 

These decision methods have been developed under the assumption that the deployment of IEDs is
stochastic and exogenous in nature, rather than under the control of an adversary who can take the
traveller’s actions into account beyond the mathematical properties of the system. While there is
certainly  value  in  reducing  the  importance  of  a  traveller’s  route  choices  down  to  local  properties, 
it  remains  unclear  how  the  change  would  be  perceived  by  the  adversary  and  how  the  next  round 
of  deployments  would  be  affected. 

Singpurwalla  [2008b]  introduced  the  notion  of  changing  the  likelihood  function  to  reflect  this, 
particularly  in  the  notion  that  the  lack  of  a  past  deployment  makes  a  current  deployment  more 
likely.  The  current  specification  is  meant  as  a  template  for  extending  the  likelihood  function  for 
past,  present  and  future  deployment  mechanisms,  since  we  specify  the  ingredients  that  will  be 
introduced  into  the  likelihood  at  each  time  point. 

6  Dynamic  modeling 

As  outlined  to  this  point,  we  have  defined  this  method  in  terms  of  deployments  during  a  single 
time  interval;  deployment  probabilities  were  estimated  ex  post  facto  for  each  road  in  the  system 
given  their  local  and  global  properties.  The  use  of  the  GLM  framework,  however,  gives  us  a  natural 
method  to  extend  this  approach  to  the  dynamic  time  frame,  and  to  allow  new  information  to  come 
into  the  method: 

1.  Extending the approach to include multiple-time-frame analysis in the GLM approach is
straightforward. Singpurwalla [2008b] introduced the network routing problem through the
specification of a time-dependent likelihood function, in the sense that a negative autocorrelation
might be present: the lack of a previous deployment along a particular road may make
a current deployment more likely to occur in the present frame (all else being equal). One
possible  scenario  is  to  assume  that  more  damage  occurs  on  roads  that  have  been  through  a 
lengthy  repair  process  following  a  detonation. 

2.  Outside  expert  opinion  can  be  introduced  to  the  estimation  process.  There  are  at  least  three 
ways of doing this: by introducing expert opinion as a covariate in the model; by using these
opinions as a mechanism for eliciting a prior distribution on the model parameters; and by treating
them as a separate estimate that can be averaged with the model in some principled way.

This  section  details  the  addition  of  these  characteristics  into  our  CTP-GLM  method  to  produce 
a  workable,  dynamic  method  for  improving  the  estimation  process,  as  well  as  the  predictive  power 
of  the  model  of  adversarial  behavior  given  past  actions.  The  use  of  Bayesian  updating  methods  is 
natural  for  this  class  of  model. 

The decision that a traveller needs to make is which route to take between two points
on a map, given two or more possible paths that may be blocked by IED deployments. The
decision-making process is sequential by nature: in the canonical Canadian Traveller Problem, once
an intersection is reached, the state of all its connected roads is known. The actor can augment
this knowledge through the use of advance scouts, remote sensing or some other method with
an associated cost, but in every case there is still a lack of information that becomes a part of the
decision process.

6.1  Bayesian  Updating  Over  Additional  Time  Periods 

One ultimate quantity of interest is $p(Y_{t+1} \mid Y_1, \ldots, Y_t)$, the posterior predictive distribution of
deployments at the next time point given past activity, after integrating out the model
parameters. These deployment probabilities, along with logistical considerations, will govern the
route selection process.

Recall  the  model  for  a  single  time  step, 

$$Y_{ij} \sim \mathrm{Be}(p_{ij}); \quad \Phi^{-1}(p_{ij}) = \mu + (X_i + X_j)\alpha + (Z_i + Z_j)\beta + X_{ij}\gamma + Z_{ij}\delta,$$


and consider a simple multistep version of the model: the deployments on each road,
conditional on the road’s characteristics and past history, are independent and identically
distributed across time periods. If this is the case, given a fixed number N of time periods, the
number of deployments on each road in the system can be modelled with a binomial distribution,
with a simple stepwise updating scheme:

•  Begin with the prior distribution for the model parameters, $p(\mu, \alpha, \beta, \gamma, \delta)$.

•  Given the first time period of deployments, assemble the likelihood for the data,
$p(Y_1 \mid \mu, \alpha, \beta, \gamma, \delta)$, which under conditional independence breaks down as

$$\prod_{1 \le i < j \le n} p(Y_{ij,1} \mid \mu, \alpha, \beta, \gamma, \delta).$$

Note that each of these components is a Bernoulli random variable.

•  Using this decomposition, solve for the posterior distribution after one time point,
$p(\mu, \alpha, \beta, \gamma, \delta \mid Y_1)$, using whichever algorithm is desired - Gibbs sampling, variational
approximation, particle filtering - making sure to adjust the estimate for the auxiliary variable
set $p_{ij}$, as it is needed for each Canadian betweenness $B_{ij}$ included in the $Z_i$ and $Z_{ij}$ terms.

•  Given this posterior distribution, repeat these steps for each new iteration of data to get the
next posterior distribution $p(\mu, \alpha, \beta, \gamma, \delta \mid Y_1, Y_2)$, noting that conditional on the parameters
$(\mu, \alpha, \beta, \gamma, \delta)$, the distribution for $Y_2$ does not depend on $Y_1$.


If the form is as simple as the binomial, this updating scheme is superfluous to the process, since
we can typically conduct the entire operation in one step, from the prior distribution $p(\mu, \alpha, \beta, \gamma, \delta)$
directly to the posterior $p(\mu, \alpha, \beta, \gamma, \delta \mid Y_1, \ldots, Y_N)$. There are also situations where the stepwise
approximation will be lossy; however, there are cases when this updating scheme may prove useful,
such as when the dependence structure is more complicated.
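
The equivalence of stepwise and one-step updating in the simple case can be illustrated with a deliberately reduced stand-in: a single road with a conjugate Beta prior on its deployment probability. This is not the GLM posterior discussed above, which has no closed form; the Beta-Bernoulli model, the function name and the data are assumptions chosen only because the arithmetic can be checked by hand.

```python
def beta_update(prior, outcomes):
    """Conjugate update of a Beta(a, b) prior with 0/1 deployment indicators."""
    a, b = prior
    hits = sum(outcomes)
    return (a + hits, b + len(outcomes) - hits)


# Hypothetical deployment indicators for one road over five time periods.
periods = [[0], [1], [0], [0], [1]]

# Stepwise: each period's posterior becomes the next period's prior.
stepwise = (1, 1)
for observed in periods:
    stepwise = beta_update(stepwise, observed)

# One step: condition on all periods at once.
one_step = beta_update((1, 1), [y for observed in periods for y in observed])
```

Both routes end at Beta(3, 4), since with independent, identically distributed deployments only the totals matter.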

6.2  Sample  Time  Dependence  through  Explicit  Specification 

The previous method rests on the assumption that a deployment on a particular road would be
independent of time and of other deployments on the same road at earlier times, an assumption that
is quite likely untenable in real situations, since a recent deployment would probably discourage
more of these in the immediate future (perhaps due to heightened vigilance, the effectiveness of
deployment on a roadway under repair, and other reasons that may be explored by substantive
experts). Singpurwalla [2008b] suggests modifications to the likelihood function to incorporate this
dependence directly; instead, we suggest that the explicit incorporation of previous observations
would be a preferable way to introduce this dependence.

One  possible  incorporation  takes  a  Markovian  form:  the  deployment  in  one  time  period  depends 
explicitly  on  the  previous  period,  as 


$$Y_{ij,t} \mid \mu, \alpha, \beta, \gamma, \delta, Y_{ij,t-1} \sim \mathrm{Be}(p_{ij,t});$$

$$\Phi^{-1}(p_{ij,t}) = \tau Y_{ij,t-1} + \mu + (X_i + X_j)\alpha + (Z_i + Z_j)\beta + X_{ij}\gamma + Z_{ij}\delta,$$

so that for positive values of $\tau$, an attack the previous day would elevate the probability on the
current day, and this increase in probability would be identical for each road in the system
(conditional on other observed characteristics). A negative $\tau$ would correspond to a decrease in
probability of an event if one had occurred in the previous time period.

In  general,  this  can  be  extended  to  any  number  of  past  days  as 


$$Y_{ij,t} \mid \mu, \alpha, \beta, \gamma, \delta, Y_{ij,t-1}, \ldots, Y_{ij,t-K} \sim \mathrm{Be}(p_{ij,t});$$

$$\Phi^{-1}(p_{ij,t}) = \sum_{k=1}^{K} \tau_k Y_{ij,t-k} + \mu + (X_i + X_j)\alpha + (Z_i + Z_j)\beta + X_{ij}\gamma + Z_{ij}\delta$$

to include any additional lags. For example, a series of negative values for each $\tau_k$, decreasing
in magnitude as k increases, would indicate that the further back one goes in time, the less a
past deployment would affect the present, so that after a time, deployments would return to their
apparent status quo rate.
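
A small sketch of how lagged terms move the deployment probability, assuming the link is the probit (standard normal) link and collapsing all non-lag terms into a single number `baseline`. The function name, the coefficient values and the histories are hypothetical.

```python
from statistics import NormalDist


def deployment_probability(baseline, taus, history):
    """Probability of a deployment on one road in the current period.

    baseline: value of the linear predictor without the lag terms
    taus:     [tau_1, ..., tau_K], lag coefficients
    history:  [Y_{t-1}, ..., Y_{t-K}], past 0/1 deployment indicators
    """
    eta = baseline + sum(tau * y for tau, y in zip(taus, history))
    return NormalDist().cdf(eta)  # inverse of the assumed probit link


# Hypothetical negative lag effects that shrink with the lag.
TAUS = [-0.8, -0.4, -0.2]
quiet = deployment_probability(-1.0, TAUS, [0, 0, 0])
yesterday = deployment_probability(-1.0, TAUS, [1, 0, 0])
three_ago = deployment_probability(-1.0, TAUS, [0, 0, 1])
```

With these values a deployment yesterday lowers today's probability more than one three periods ago, and a road with no recent deployments sits at the status quo rate.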


7  Elicitation  of  Expert  Knowledge 

While  the  model-based  approach  can  elucidate  a  good  deal  of  information,  a  primary  strength  of 
the  Bayesian  approach  is  the  ability  to  incorporate  other  information  into  the  predictive  framework. 
One method that makes the inclusion of expert opinion explicit is elicitation of prior distributions
(e.g. Garthwaite et al. [2005]). We suggest a simple method for converting from expert belief on
roads into prior distributions on the parameters of interest in the model. A second approach is
the addition of expert prediction as a covariate in the model, though this makes the uncertainty
of their prediction harder to assess. The third method we discuss is the principled averaging of a
parametric model prediction with properly calibrated expert opinion in the predictive phase of the model.

7.1  Expert  Loadings  as  Prior  Weights 

Elicitation is a standard method for including expert information into both model and prior
specification [Garthwaite et al., 2005]. By asking a series of questions of the expert, we can obtain
information on the shape of the probability distribution that best describes their beliefs about
likelihoods of events or strengths of associations. We can then convert this information into a
distribution on the model parameters. Ideally, this information should be independent of anything
known about the data under observation, such as the particular units of analysis (in this case, the
roads themselves), but we can still adapt it to such circumstances when necessary.

If our goal is to learn something about the mechanisms in our model, such as the $(\alpha, \beta, \gamma, \delta)$
coefficients on local and global properties, then suitable questions may be difficult to pose
directly: asking an expert about increased probabilities conditional on a covariate may be
difficult to put into meaningful words, but asking an expert for their estimate of the probability of
an IED deployment on a particular road in a time period is a tractable question. This forms the
basis for our preliminary method for expert elicitation with respect to model parameters.

1.  Select the expert from whom information will be elicited. Ask a series of “warm-up” questions
to make the subject comfortable with probability assessments and uncertainties (see Tversky
and Kahneman [1974] for an overview on these processes).

2.  For each road (i, j) in the system, query the expert about their belief in the probability of a
deployment of an IED in the time period in question, as well as their uncertainty about these
probabilities. (See Garthwaite et al. [2005] for more information on the process of elicitation.)

3.  Set up the system of equations corresponding to

$$\Phi^{-1}(p_{ij}) = \tau Y_{ij,t-1} + \mu + (X_i + X_j)\alpha + (Z_i + Z_j)\beta + X_{ij}\gamma + Z_{ij}\delta + \epsilon_{ij},$$

and plug in the probability estimates for the $p_{ij}$; set $\epsilon_{ij} \sim N(0, \sigma^2)$, where $\sigma^2$ is an auxiliary
variance parameter for this procedure. Solve for the “posterior” distribution of $(\mu, \alpha, \beta, \gamma, \delta \mid p_{ij})$
under this model beginning with a flat prior distribution.

4.  Replace  pij  with  a  draw  from  the  distribution  specified  by  the  expert.  Repeat  the  procedure. 

5.  Repeat  the  last  step  a  large  number  of  times  until  a  series  of  distributions  has  been  obtained. 
Take  the  average  of  these  “posteriors”  and  label  this  as  the  elicited  prior  distribution  for  this 
expert. 

We  think  of  this  method  as  somewhat  of  a  template  that  we  can  alter  and  refine  in  many 
different  ways.  Using  this  template,  the  principle  of  elicitation  is  firmly  in  place,  using  expert 
estimates  to  yield  quantitative  information  for  the  model’s  prior  parameters.  We  can  then  use  the 
procedure  from  Section  6.1,  starting  with  the  elicited  prior  distribution  as  specified  here. 
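
A heavily simplified sketch of this template follows. It assumes a probit link, a single road-level covariate with intercept $\mu$ and slope $\delta$, no lag term, and expert beliefs summarised as Beta distributions. For each draw it keeps only the least-squares estimate (the posterior mean under a flat prior and Gaussian errors) instead of the full “posterior”. All names and numbers are hypothetical.

```python
import random
from statistics import NormalDist, mean


def elicited_prior_draws(expert_beta, covariate, n_draws=2000, seed=0):
    """Draws of (mu, delta) implied by an expert's road-level probabilities.

    expert_beta: {road: (a, b)}, Beta parameters encoding the expert's stated
                 probability and uncertainty for each road
    covariate:   {road: x}, one covariate per road (must not be constant)
    """
    rng = random.Random(seed)
    probit = NormalDist().inv_cdf
    roads = sorted(expert_beta)
    x = [covariate[road] for road in roads]
    x_bar = mean(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    draws = []
    for _ in range(n_draws):
        # Step 4: replace each p_ij with a draw from the expert's distribution.
        p = [rng.betavariate(*expert_beta[road]) for road in roads]
        p = [min(max(pi, 1e-9), 1 - 1e-9) for pi in p]  # keep probit finite
        z = [probit(pi) for pi in p]
        # Step 3: regress the probit-scale probabilities on the covariate.
        z_bar = mean(z)
        delta = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
        draws.append((z_bar - delta * x_bar, delta))
    return draws  # step 5: pool these to form the elicited prior


# Hypothetical expert: deployment is more likely on roads with a larger covariate.
EXPERT = {"r1": (2, 18), "r2": (5, 15), "r3": (10, 10), "r4": (15, 5)}
COVARIATE = {"r1": 0.0, "r2": 1.0, "r3": 2.0, "r4": 3.0}
draws = elicited_prior_draws(EXPERT, COVARIATE, n_draws=500)
prior_mean_delta = mean(d for _, d in draws)
```

The spread of the pooled draws reflects the expert's stated uncertainty; a fuller version would also carry the within-draw posterior variance governed by $\sigma^2$.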

7.2  Covariate  Addition 

Rather  than  consider  the  expert  opinion  as  the  earlier  basis  for  the  model,  we  might  instead  treat 
the  expert  as  a  new  source  of  information.  If  experts  are  available  at  each  point  in  the  study  under 
question,  then  their  opinions  on  the  probability  of  a  deployment  on  any  particular  road  can  be 
added  as  covariates  in  the  model,  either  as  a  prediction  (0  or  1)  or  as  a  probability. 

The  disadvantages  with  this  approach  are  obvious:  it  requires  the  continuous  input  of  an  expert 
during  the  process  to  be  of  any  effect  and  it  does  allow  the  experts  to  calibrate  their  assessments 
against one another and against the predictive strength of the model. The direct interaction of
these multiple sources of information may well affect the very estimates being made, rather
than keeping the two sources distinct (as has been observed in expert correlations of global
warming estimates).

7.3  Model  Averaging  and  Linear  Bayes  Approaches 

Rather  than  worrying  about  directly  fusing  together  two  sources  of  information,  an  alternative  is 
to treat the model’s predictions and the expert’s as two (or more) separate sources to be averaged
together.  This  approach  is  common  in  prediction,  and  can  be  verified  using  repeated  trials  on 
the  same  experts,  as  above;  sporting  events  (such  as  BCS  championships  in  college  football)  and 
elections  (such  as  the  popular  website  FiveThirtyEight.com)  are  both  frequently  predicted  using 
averages  of  expert  and  lay  opinions  plus  more  “objective”  data  models.  The  weight  applied  to 
each  predictor  is  adjusted  with  each  successive  time  period  or  event  of  note,  so  that  over  time  the 
predictions  should  improve  in  quality  assuming  all  underlying  assumptions  remain  true. 
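
One possible way to adjust the weights is sketched below. The exponential down-weighting by squared error is an illustrative choice made for this sketch, not a scheme specified in this report; the function names, the learning rate and the example predictions are likewise assumptions.

```python
import math


def combine(predictions, weights):
    """Weighted average of the predictors' deployment probabilities."""
    total = sum(weights[name] for name in predictions)
    return sum(weights[name] * predictions[name] for name in predictions) / total


def update_weights(weights, predictions, outcome, rate=1.0):
    """Shrink each weight according to its squared error on the observed
    outcome (0 or 1), then renormalise so the weights sum to one."""
    raw = {name: w * math.exp(-rate * (predictions[name] - outcome) ** 2)
           for name, w in weights.items()}
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}


# Hypothetical predictions for one road in one period.
weights = {"model": 0.5, "expert": 0.5}
predictions = {"model": 0.8, "expert": 0.3}
pooled = combine(predictions, weights)
weights_after = update_weights(weights, predictions, outcome=1)
```

After a deployment is observed, the predictor that assigned it the higher probability gains weight for the next period.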

Much  empirical  literature  suggests  that  model  averaging  may  be  far  from  optimal  when  none 
of  the  predictors  is  based  on  a  true  model  of  the  phenomenon  under  study  (e.g.  Geweke  and 
Amisano [2010]). The alternative of using a pooled linear Bayesian predictor for this problem could
benefit  from  careful  exploration,  such  as  the  family  of  methods  known  as  Bayesian  Model  Averaging 
[Raftery,  1995;  Raftery  et  al.,  1997]. 

8  Some  Additional  Extensions 

We  have  presented  some  standard  extensions  to  a  robust  and  flexible  modeling  approach  that 
may  be  used  for  prediction  in  this  kind  of  system;  however,  there  is  considerable  room  for  the 
development  and  extension  of  these  ideas.  We  mention  two  examples  of  tasks  as  obvious  next  steps 
for  this  line  of  research. 

8.1  Comprehensive  Data  Collection  and  Analysis 

While we could continue to develop these modeling ideas on simulated data sets and prototypical
road systems, we cannot assess adversarial interest, nor can we validate these modeling
assumptions, without substantially improved data, both for IED placement and the elicitation of experts.
Prototypical structures can prove useful in developing a concept, but certainly not in providing a
meaningful illustration for policy purposes.

We  are  aware  that  this  kind  of  data  has  been  collected  for  military  purposes;  once  the  data  can 
be  processed  into  a  form  for  inclusion  in  the  proposed  model,  natural  refinements  and  extensions 
may  surface. 

8.2  Approximations  to  Canadian  Betweenness 

The slowest part of the modeling process is the assessment of “Canadian Betweenness”, which we
showed in Section 4.3 to be quite time-intensive to compute, making this impractical for larger
networks without extensive pre-processing. If the measure appears to be a useful assessor of
importance in real deployments, then an approximation to this measure may prove to be more useful in
practice than the autoregressive form of the model that we are dealing with now, and may make it
easier to handle the uncertainty in parameter estimates caused by the autoregressive
components.

A  Connected Literature


Here  we  list  a  number  of  literature  topics  that  do  not  directly  connect  to  the  methods  we  have 
used  in  this  paper,  but  are  nonetheless  related  to  the  subject  of  security  and  efficiency  on  transition 
along  a  graph. 

A.1  Vulnerabilities of Key-sharing Graphs

A  key-sharing  network  [Warren  et  al.,  2006]  is  formed  by  beginning  with  a  series  of  interconnected 
routers.  Each  router  has  a  set  of  alphanumeric  keys;  any  two  routers  that  have  at  least  one  key  in 
common  are  able  to  communicate  with  encrypted  signals,  and  are  hence  considered  connected.  The 
resulting network topology is that obtained when the sharing properties of all keys are taken into
account simultaneously.

What  defines  the  network  is  also  what  defines  its  vulnerability.  If  a  key  becomes  known  to  an 
outside  observer,  the  embedded  network  defined  by  that  key  is  compromised,  leaving  those  nodes 
open  to  attack  from  an  outside  adversary. 

Other works dealing with the vulnerabilities of key-sharing graphs are Eschenauer and Gligor
[2002] and Chan et al. [2003], which deal with mechanisms for generating key distributions among
nodes to create secure connection patterns.

These  works  contain  a  unique  perspective  on  security  connected  directly  to  the  topological 
properties  of  the  network,  one  that  is  somewhat  relevant  to  the  problem  at  hand.  In  this  case, 
adversarial  action  is  known  to  disable  a  series  of  nodes  and  their  associated  edges;  in  the  IED 
problem, this action will only disable an edge but will have consequences for traffic flow through
the  edge’s  associated  nodes,  not  to  mention  others  through  the  change  in  traffic  patterns  in  the 
system. 

A.2  Trace Route Sampling

As  a  practical  procedure  for  discovering  network  ties,  traceroute  sampling  is  an  active  mechanism 
for  inferring  pathways  between  two  nodes  in  cases  where  explicit  ties  cannot  be  directly  investigated, 
and  intervening  nodes  can  only  be  identified  as  intermediates  in  route  traces.  However,  this  process 
is biased in favor of identifying central nodes and inferring incorrect degree distributions [Achlioptas
et al., 2009]. There would seem to be a minimal connection with our work if we know the complete
network  at  its  maximum  and  have  some  understanding  of  the  IED  placement  distribution. 

A.3  Aerial Detection Systems

Optimizing  any  traffic  route  according  to  the  overall  likelihood  of  IED  placement  requires  prior 
information  on  placement  strategies  as  well  as  the  inclusion  of  new  data  as  time  passes.  One 
particular  implementation  of  IED  detection  is  the  use  of  unmanned  aerial  systems  (UASes)  with 
sensor devices to detect the deployment of IEDs along anticipated travel routes [Royset and Reber,
2008].  The  approach  considered  by  the  authors  involves  a  grid-like  search  pattern,  so  that  only 
those  grid  areas  with  roads  used  by  friendly  forces  will  be  searched  as  part  of  the  deployment 
pattern. 

The  suggested  algorithm  for  implementing  this  approach  is  a  linear  programming  solution  that 
incorporates  wind  speed,  travel  time  and  prior  likelihood  of  deployment.  For  the  problem  at  hand, 
we  expect  that  aerial  reconnaissance  data  can  be  used  as  prior  information  in  the  vehicle  routing 
problem  and  vice  versa;  therefore,  for  the  scope  of  this  investigation,  we  do  not  consider  the 
immediate  detection  of  IEDs  by  UAS  methods  simultaneously  with  ground  transport  as  a  feasible 
(or  necessary)  research  direction. 